PREFACE

The  major  objective  of  ARPA’s  Network  Secure  Communication  (NSC)  project  is  to 
develop  and  demonstrate  the  feasibility  of  secure,  high-quality,  low-bandwidth,  real-time, 
full-duplex  (two-way)  digital  voice  communications  over  packet-switched  computer 
communication  networks.  This  kind  of  communication  is  a very  high  priority  military  goal 
for  all  levels  of  command  and  control  activities.  ARPA’s  NSC  project  will  supply  digitized 
speech  which  can  be  secured  by  existing  encryption  devices.  The  major  goal  of  this 
research  is  to  demonstrate  a digital  high-quality,  low-bandwidth,  secure  voice  handling 
capability  as  part  of  the  general  military  requirement  for  worldwide  secure  voice 
communication.  The  development  at  ISI  of  the  Network  Voice  Protocol  described  herein  is 
an  important  part  of  the  total  effort. 
1. INTRODUCTION


Currently,  computer  communication  networks  are  designed  for  data  transfer.  Since 
there  is  a growing  need  for  communication  of  real-time  interactive  voice  over  computer 
networks, a new communication discipline must be developed. The current HOST-to-HOST
protocol  of  the  ARPANET,  which  was  designed  (and  optimized)  for  data  transfer,  was  found 
unsuitable  for  real-time  voice  communication.  Therefore  this  Network  Voice  Protocol 
(NVP)  was  designed  and  implemented. 

Important  design  objectives  of  the  NVP  are: 

• Recovery  from  loss  of  any  message  without  catastrophic  effects. 
Therefore  all  answers  have  to  be  unambiguous,  in  the  sense  that  it  must 
be  clear  to  which  inquiry  a reply  refers. 

• Design  such  that  no  system  can  tie  up  the  resources  of  another  system 
unnecessarily. 

• Avoidance  of  end-to-end  retransmission. 

• Separation  of  control  signals  from  data  traffic. 

• Separation  of  vocoding-dependent  parts  from  vocoding-independent  parts. 

• Adaptation  to  the  dynamic  network  performance. 

• Optimal  performance,  i.e.  guaranteed  required  bandwidth,  and  minimized 
maximum  delay. 

• Independence  from  lower  level  protocols. 


The  protocol  consists  of  two  parts: 

(1)  the  control  protocol, 

(2)  the  data  protocol. 
Control messages are sent as controlled (TYPE 0/0) messages, and data messages
may  be  sent  as  either  controlled  (TYPE  0/0)  or  uncontrolled  (TYPE  0/3)  messages. 

Throughout  this  document  a "word"  means  a "16-bit  quantity". 


*See  BBN  Report  1822  for  definition  of  MESSAGE-TYPE. 
2. THE CONTROL PROTOCOL


Throughout  this  document  the  12-bit  MESSAGE-ID  (see  BBN  Report  1822)  is 
referred  to  as  LINK  (its  8 MSBs)  and  SUB-LINK  (its  4 LSBs). 

The  control  protocol  starts  with  an  initial  connection  phase  on  link  377  and 
continues  on  other  links  assigned  at  run  time. 

Four  links  are  used  for  each  voice  communication: 


Link L   will be used for control, from CALLER   to ANSWERER.
Link K   will be used for control, from ANSWERER to CALLER.
Link L+1 will be used for data,    from CALLER   to ANSWERER.
Link K+1 will be used for data,    from ANSWERER to CALLER.


Both  L and  K should  be  between  340  and  375  (octal).  L and  K need  not  differ. 
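As an illustration only, the link arithmetic above can be sketched in Python. The names nvp_links and split_message_id are hypothetical and are not part of the protocol. The sketch assumes that L and K are given as integers and that LINK occupies the 8 most significant bits of the 12-bit MESSAGE-ID, as stated at the start of this section.

```python
def split_message_id(message_id: int) -> tuple:
    """Split a 12-bit MESSAGE-ID into (LINK, SUB-LINK)."""
    if not 0 <= message_id < (1 << 12):
        raise ValueError("MESSAGE-ID must fit in 12 bits")
    return (message_id >> 4) & 0xFF, message_id & 0xF


def nvp_links(l: int, k: int) -> dict:
    """Return the four links used by one call, given control links L and K."""
    for name, link in (("L", l), ("K", k)):
        # Both L and K should be between 340 and 375 (octal).
        if not 0o340 <= link <= 0o375:
            raise ValueError(f"{name} must be between 340 and 375 (octal)")
    return {
        "control_caller_to_answerer": l,
        "control_answerer_to_caller": k,
        "data_caller_to_answerer": l + 1,
        "data_answerer_to_caller": k + 1,
    }


if __name__ == "__main__":
    for role, link in nvp_links(0o350, 0o360).items():
        print(f"{role}: {link:o} (octal)")
```

With L = 350 and K = 360 (octal), data flows on 351 and 361, as in the examples of Section 4.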


The  first  message  (CALLER  to  ANSWERER)  on  link  377  indicates  which  user  wants  to 
talk  to  whom  and  specifies  K.  As  a response  (on  K),  the  ANSWERER  either  refuses  the  call 
or  accepts  it  and  assigns  L. 

The  CALLER  then  calls  again  (this  time  on  link  L).  The  ANSWERER  initiates  a 
negotiation  session  to  verify  the  compatibility  of  the  two  parties. 

The  negotiation  consists  of  suggestions  put  forth  by  one  of  the  parties,  which  are 
either  accepted  or  rejected  by  the  other  party.  The  suggesting  party  in  the  negotiation  is 
called  the  NEGOTIATION  MASTER.  The  other  party  is  called  the  NEGOTIATION  SLAVE. 
Usually  the  ANSWERER  is  the  negotiation  master,  unless  agreed  otherwise  by  the  method 
described  later. 


If  the  negotiation  fails,  either  party  may  terminate  the  call  by  sending  a "GOODBYE". 
If  the  negotiation  is  successfully  ended,  the  ANSWERER  rings  bells  to  draw  human  attention 
and  sends  "RINGING"  to  the  CALLER.  When  the  call  is  answered  (by  a human),  a "READY"  is 
sent to the CALLER and the data starts flowing (on L+1 and K+1). However, a "READY" can
be  sent  without  a preceding  "RINGING". 
This bell ringing occurs only after the initial call (not after renegotiation).

The  assignment  of  L and  K cannot  be  changed  after  the  initial  connection  phase. 

Only  one  control  message  can  be  sent  in  a network  message.  Extra  bits  needed  to 
fill  the  network  message  are  ignored. 

The length of control messages should never exceed a single packet (i.e., 1,007 data
bits).

Control  messages  not  recognized  by  their  receiver  should  be  ignored  and  should  not 
cause any error condition resulting in termination of the connection. These messages may
result  from  differences  in  implementation  level  between  systems. 

SUMMARY OF THE CONTROL MESSAGES

#1   "1,<WHO>,<WHOM>,K"

#2   "2,<CODE>" or only "2"

#3   "3,<WHAT>,<N>,<HOW(1),...,HOW(N)>"

#4   "4,<WHAT>,<HOW>"

#5   "5,<WHAT>,<HOW>" or only "5,<WHAT>"

#6   "6,L" or only "6"

#7   "7"

#8   "8"

#9   "9"

#10  "10,<ID>"

#11  "11,<ID>"

#12  "12,<IM>"

#13  "13,<YM>,<OK>"


DEFINITION  OF  THE  CONTROL  MESSAGES 


#1  CALLING (on 377 and L)

This call is issued first on link 377 and later on link L. Its format is
"1,<WHO>,<WHOM>,K", where <WHO> and <WHOM> are words which identify
respectively the calling party and the party being called, and K is as
defined above. The format of the <WHO> and <WHOM> is

(HHIIIIIIXXXXXXXX)

where  HH  are  2 bits  identifying  the  HOST,  followed  by  6 bits  identifying 
the  IMP,  followed  by  8 bits  identifying  the  extension  (needed  because 
there  may  be  more  than  one  communication  unit  on  the  same  HOST). 

The  system  which  sends  this  message  is  defined  as  the  CALLER,  and  the 
other  system  is  defined  as  the  ANSWERER. 
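The <WHO>/<WHOM> word layout can be illustrated with a short Python sketch. The function names are hypothetical. The sketch assumes that the layout is written most significant bit first, so that the 2 HOST bits are the top bits of the 16-bit word and the extension is the low byte.

```python
def pack_party(host: int, imp: int, extension: int) -> int:
    """Build a 16-bit <WHO>/<WHOM> word: HH IIIIII XXXXXXXX."""
    if not 0 <= host < 4:
        raise ValueError("HOST must fit in 2 bits")
    if not 0 <= imp < 64:
        raise ValueError("IMP must fit in 6 bits")
    if not 0 <= extension < 256:
        raise ValueError("extension must fit in 8 bits")
    return (host << 14) | (imp << 8) | extension


def unpack_party(word: int) -> tuple:
    """Return (host, imp, extension) from a 16-bit <WHO>/<WHOM> word."""
    return (word >> 14) & 0x3, (word >> 8) & 0x3F, word & 0xFF


if __name__ == "__main__":
    word = pack_party(host=1, imp=22, extension=5)
    print(f"{word:016b}", unpack_party(word))
```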

#2  GOODBYE  (TERMINATION,  on  L or  K) 

This message has the purpose of terminating calls at any stage.

ICP  can  be  terminated  (on  K)  either  negatively  by  sending  either  a single 
word  "2"  ("GOODBYE")  or  the  two  words  "2,<CODE>",  or  positively  by 
sending  the  two  words  "6,L",  as  described  later. 

After  the  initial  connection  phase,  calls  can  be  terminated  by  either  the 
CALLER (on L) or the ANSWERER (on K). This termination has two words:
"2,<CODE>", where <CODE> is the reason for the termination, as specified
here: 


0.  Other  than  the  following. 

1.  I am  busy. 

2.  I am  not  authorized  to  talk  with  you. 

3.  Request  of  my  user. 

4.  We  believe  you  are  down. 

5.  Systems  incompatibility  (NEGOTIATION  failure). 
6.  We have problems.

7.  I am  in  a conference  now. 

8.  You  made  a protocol  error. 

#3  NEGOTIATION  INQUIRY  (on  L or  K) 

Sent by the NEGOTIATION MASTER for compatibility verification. The
format  is 

"3,<WHAT>,<LIST-LENGTH>,<HOW-LIST>",  meaning 

"CAN-YOU-DO,<WHAT>,<LIST-LENGTH>,<HOW-LIST>". 

The  <HOW-LIST>  is  a list  of  pointers  into  agreed-upon  tables,  as  shown 
below. 

#4  POSITIVE  NEGOTIATION  RESPONSE  (on  L or  K) 

Sent  by  the  NEGOTIATION  SLAVE  in  response  to  a NEGOTIATION  INQUIRY. 
The format is

"4,<WHAT>,<HOW>", meaning: "I-CAN-DO,<WHAT>,<HOW>".

#5  NEGATIVE  NEGOTIATION  RESPONSE  (on  L or  K) 

Sent  by  the  NEGOTIATION  SLAVE  in  response  to  a NEGOTIATION  INQUIRY. 
The format is either

"5,<WHAT>,0", meaning "I-CAN'T-DO-<WHAT>-IN-ANY-OF-THESE-WAYS",

or  "5,<WHAT>,N",  meaning  inability  to  accept  any  of  the  options  offered  in 
the  INQUIRY,  but  using  "N"  as  a suggestion  to  the  ANSWERER  about 
another  possibility.  Examples  are  presented  later  in  this  report. 
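The inquiry/response exchange of messages #3, #4 and #5 can be sketched as follows. This is a minimal illustration, not a prescribed algorithm: the function name answer_inquiry and the "supported" dictionary are hypothetical, and the policy of accepting the responder's own most preferred option among those offered is an assumption.

```python
def answer_inquiry(inquiry: list, supported: dict) -> list:
    """Answer a NEGOTIATION INQUIRY "3,<WHAT>,<N>,<HOW-LIST>".

    supported maps each <WHAT> to the <HOW> values this system can do,
    in its own order of preference (a hypothetical local table).
    """
    if len(inquiry) < 3 or inquiry[0] != 3:
        raise ValueError("not a NEGOTIATION INQUIRY")
    what, n = inquiry[1], inquiry[2]
    offered = inquiry[3:3 + n]
    mine = supported.get(what, [])
    for how in mine:
        if how in offered:
            return [4, what, how]      # POSITIVE NEGOTIATION RESPONSE
    # NEGATIVE NEGOTIATION RESPONSE: suggest an alternative if one exists,
    # otherwise reply with 0.
    return [5, what, mine[0] if mine else 0]


if __name__ == "__main__":
    lpc_only = {1: [1], 4: [976]}
    print(answer_inquiry([3, 1, 1, 2], lpc_only))   # CVSD offered, LPC suggested
    print(answer_inquiry([3, 1, 1, 1], lpc_only))   # LPC offered and accepted
```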

#6  READY (on L or K)

Sent  by  either  party  to  indicate  readiness  to  accept  data.  Its  format  is 
"6,L"  in  the  reply  to  the  initial  call,  and  "6"  thereafter. 

#7  NOT READY (on L or K)

Sent  by  either  party  to  indicate  unreadiness  to  accept  data.  It  is  always 
a single  word:  "7". 


#8  INQUIRY  (on  L or  K) 

Sent  by  either  party  to  inquire  about  the  status  of  the  other.  It  is  always 
a single  word:  "8".  It  is  answered  by  #6,  #7,  or  #9. 

#9  RINGING  (on  K) 

Sent  by  the  ANSWERER  after  the  negotiations  have  been  successfully 
terminated  and  human  permission  is  needed  to  proceed  further.  The 
ringing  will  continue  for  10  seconds,  and  then  stop,  unless  a #8  is 
received.  This  message  is  always  a single  word:  "9". 

#10  ECHO  REQUEST  (on  L or  K) 

Sent  by  whichever  party  is  interested  in  measuring  the  network  delays. 
Its  only  purpose  is  to  be  echoed  immediately.  The  format  is  "10,<ID>", 
where  <ID>  is  any  word  used  to  identify  the  ECHO. 

#11  ECHO  (on  L or  K) 

Sent  in  response  to  ECHO  REQUEST.  The  format  is  "11,<ID>",  where  <ID> 
is the word specified by #10. The implementation of this feature is not
compulsory,  and  no  connection  should  be  terminated  due  to  lack  of 
response  to  ECHO  REQUEST. 

#12  RENEGOTIATION  REQUEST  (on  L or  K) 

Can  be  sent  by  either  party  at  any  stage  after  LINKS  are  agreed  upon. 
This  message  consists  of  the  two  words  "12,<IM>".  If  the  word  <IM>  (for 
I MASTER)  is  nonzero,  the  sender  of  this  message  requests  to  be  the 
NEGOTIATION  MASTER.  If  it  is  zero,  the  receiver  of  this  message  is 
requested  to  be  the  NEGOTIATION  MASTER.  Renegotiation  is  described 
later. 

#13  RENEGOTIATION  APPROVAL  (on  L or  K) 

This  message  may  be  sent  by  either  party  in  response  to  RENEGOTIATION 
REQUEST. It consists of the three words "13,<YM>,<OK>". If <OK> is
nonzero,  this  is  a positive  acknowledgment  (approval).  If  it  is  zero,  this  is 
a negative  acknowledgment  (i.e.,  refusal).  <YM>  is  set  to  be  equal  to  the 
<IM>  of  #12,  for  identification  purposes. 

Messages  #7,  #8,  and  #9  are  always  a single  word.  Messages  #1,  #3,  #4,  and  #5 
are  several  words  long.  Messages  #2  and  #6  are  either  a single  word  or  two  words  long. 
Messages  #10,  #11,  and  #12  are  always  2 words  long.  Message  #13  is  always  3 words 
long.  Message  #1  is  always  4 words  long. 
Message #1 is sent only by the CALLER, #3 only by the NEGOTIATION MASTER, and
#4  and  #5  only  by  the  NEGOTIATION  SLAVE.  Message  #9  is  sent  only  by  the  ANSWERER. 
All  the  other  control  messages  may  be  sent  by  either  party. 
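The sender restrictions just listed can be captured in a small predicate. This is only a sketch; the function name may_send and its boolean parameters are hypothetical.

```python
def may_send(msg_type: int, is_caller: bool, is_master: bool) -> bool:
    """Return True if a party in the given roles may send control message #msg_type."""
    if msg_type == 1:            # CALLING: CALLER only
        return is_caller
    if msg_type == 3:            # NEGOTIATION INQUIRY: NEGOTIATION MASTER only
        return is_master
    if msg_type in (4, 5):       # NEGOTIATION RESPONSES: NEGOTIATION SLAVE only
        return not is_master
    if msg_type == 9:            # RINGING: ANSWERER only
        return not is_caller
    # All the other control messages (#2, #6, #7, #8, #10-#13): either party.
    return 2 <= msg_type <= 13


if __name__ == "__main__":
    print(may_send(9, is_caller=False, is_master=True))
```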

The  last  <HOW>  which  was  both  suggested  by  the  NEGOTIATION  MASTER  (in  #3)  and 
accepted  by  the  NEGOTIATION  SLAVE  (in  #4)  for  each  <WHAT>  is  assumed  to  be  in  use. 


DEFINITION OF THE <WHAT> AND <HOW> NEGOTIATION TABLES

      <WHAT>                              <HOW>

 1. VOCODING                            * 1. LPC
                                        + 2. CVSD
                                          3. RELP
                                          4. DELCO

 2. SAMPLE PERIOD                         N. N (*150) (+62)
    (in microseconds)

 3. VERSION                             * 1. V1 (see definition below)
                                        + 2. V2 (see definition below)

 4. MAX MSG LENGTH (in bits)              N. N (*976) (+976)
    NVP header included (32 bits)
    but not HOST/IMP leader
    and not HOST/IMP padding

 5. If LPC:  Degree                       N. For N coefficients (*10)
    If CVSD: Time Constant                N. N (+50)
             (in milliseconds)

 6. Samples per Parcel                    N. N (*128) (+224)

 7. If LPC: Acoustic Coding             * 1. SIMPLE (see below)
                                          2. OPTIMIZED

 8. If LPC: Info Coding                 * 1. SIMPLE (see below)
                                          2. OPTIMIZED

 9. If LPC: Pre-emphasis                  N. N (*58, for μ = 58/64 = 0.90625)
    1 - μ[Z^-1]
    N = 64 x μ

10. If LPC: Tables-Set                    N. N (*1)
                                          See definition of Set #1 in
                                          Appendix 1

(* indicates recommended options for LPC)
(+ indicates recommended options for CVSD)


No  parameter  (<WHAT>)  should  be  inquired  about  by  the  NEGOTIATION  MASTER  if 
some  option  for  it  (<HOW>)  has  been  previously  accepted  by  the  NEGOTIATION  SLAVE 
implicitly  in  the  "VERSION".  The  purpose  of  this  restriction  is  to  avoid  a possible  conflict 
between  individual  parameters  and  the  VERSION-option. 

Version 1 (V1) is defined as

1-1      LPC
2-150    150 microseconds sampling
3-1      V1
5-10     10 coefficients
6-128    128 samples per parcel
7-1      SIMPLE acoustic coding
8-1      SIMPLE information coding
9-58     μ = 58/64 = 0.90625
10-1     Tables-Set-#1

Version  2 (V2)  is  defined  as 

1-2      CVSD
2-62     62 microseconds sampling (16 KHz sampling)
3-2      V2
5-50     50 milliseconds time constant
6-192    192 samples per parcel

Note  that  this  defines  every  negotiated  parameter  except  MAX  MSG  LENGTH. 
SIMPLE  and  OPTIMIZED  codings  will  be  described  below  in  Section  3. 
All the negotiation is managed by the NEGOTIATION MASTER, who decides how much
negotiation is needed, and what to do in case some discrepancy (incompatibility) is
discovered: either to try alternative options or to abort the connection. Upon completion
of successful negotiation, the NEGOTIATION MASTER sends #9 (RINGING) only if it is
the ANSWERER and this is an initial connection; otherwise it sends #6 (READY-FOR-DATA), and
probably inquires with #8 about the readiness of the other party. Inquiries (#8) sent
before the successful completion of the negotiation are ignored. However, inquiries
after the first RINGING (#9) and before the first READY (#6) are needed to keep the
ANSWERER ringing.

Note  that  the  negotiation  process  can  be  shortened  by  using  the  VERSION  option,  as 
shown  in  the  examples  that  follow. 


ON  RENEGOTIATION 

At  any  stage  after  links  are  agreed  upon,  either  party  might  request  a 
RENEGOTIATION.  If  the  request  is  approved  by  the  other  party,  either  party  might  become 
the  NEGOTIATION  MASTER,  depending  on  the  type  of  renegotiation  request.  When 
renegotiation  starts,  no  previously  negotiated  agreements  (except  LINK  numbers)  hold,  and 
all  items  have  to  be  renegotiated  from  scratch.  Note  that  renegotiation  may  entirely 
replace  the  negotiation  phase  and  allows  the  CALLER  to  be  the  NEGOTIATION  MASTER. 

Upon  issuance  (or  reception)  of  RENEGOTIATION  REQUEST,  all  data  messages  are 
ignored  until  the  positive  indication  of  the  successful  completion  of  the  renegotiation  (#6). 

After  the  completion  of  renegotiation,  the  frame-count  (see  the  section  on 
MESSAGE-HEADER)  may  be  reset  to  zero. 


THE HEADER OF DATA MESSAGES

Data  messages  are  the  messages  which  contain  vocoded  speech.  The  first  32  bits 
of  each  data  message  form  the  MESSAGE-HEADER,  which  carries  sequence  and  timing 
information  as  described  below. 

For  each  vocoding  scheme  a "FRAME"  is  defined  as  the  transmission  interval  (as 
agreed  upon  at  the  negotiation  stage  in  <WHAT#6>).  Since  this  interval  is  defined  by  the 
number of samples, its duration can be found by multiplying the sampling period
<WHAT#2> by the interval length (in samples) <WHAT#6>. For example, in V1 the sampling
period is 150 microseconds and the transmission interval is 128 samples, which yields


128*150 microseconds = 19.2 milliseconds.


The data describing a FRAME is called a PARCEL. Each parcel has a serial number.
The  first  parcel  created  after  the  completion  of  the  negotiation  (or  every  RENEGOTIATION) 
has  the  serial  number  zero.  Each  message  contains  an  integral  number  of  parcels. 

The  serial  number  of  the  first  parcel  in  the  message  is  put  in  the  first  16  bits  of  the 
message  and  is  referred  to  as  the  MESSAGE-TIME-STAMP.  Note  that  this  time  stamp  is 
synchronized  with  the  data  stream.  Note  also  that  these  16  bits  are  actually  the  third 
word  of  the  message,  following  the  2 words  used  as  IMP-to-HOST  leader  (see  BBN  Report 
1822). 


The  next  bit  in  the  header  is  the  WE-SKIPPED-PARCELS  bit,  which  is  described  later. 
The  next  7 bits  tell  how  many  parcels  there  are  in  the  message;  this  number  is  called  the 
COUNT,  or  the  PARCEL-COUNT. 

Note  that  if  message  number  N has  the  time  stamp  T(N)  and  the  count  C(N),  then 
T(N+1)  must  be  greater  than  or  equal  to  T(N)+C(N).  Usually  T(N+1)  = T(N)+C(N),  unless 
the  Transmitter  decided  not  to  send  some  parcels  due  to  silence.  If  this  happens,  then  the 
WE-SKIPPED-PARCELS  bit  is  set  to  ONE,  else  it  is  set  to  ZERO.  Hence,  if  T(N+1)  is  found 
by  the  Receiver  to  be  greater  than  T(N)+C(N)  and  the  WE-SKIPPED-PARCELS  is  zero,  some 
message  must  be  lost. 

Note  that  by  definition  the  time  stamps  on  messages  monotonically  increase,  except 
for  wraparound. 
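The loss-detection rule above, including the 16-bit wraparound of the time stamp, can be sketched as follows. The function name and the returned labels are hypothetical, and the sketch assumes that messages are examined in the order in which they were sent.

```python
def classify_arrival(prev_stamp: int, prev_count: int,
                     stamp: int, skipped_bit: int) -> tuple:
    """Compare message N+1 against message N.

    Returns (label, gap), where gap is T(N+1) - (T(N) + C(N)) computed
    modulo 2**16 so that time-stamp wraparound is handled.
    """
    expected = (prev_stamp + prev_count) & 0xFFFF
    gap = (stamp - expected) & 0xFFFF
    if gap == 0:
        return "contiguous", 0
    if skipped_bit:
        return "silence", gap      # transmitter skipped parcels on purpose
    return "lost", gap             # some message must have been lost


if __name__ == "__main__":
    print(classify_arrival(100, 3, 103, 0))
    print(classify_arrival(100, 3, 110, 1))
    print(classify_arrival(100, 3, 110, 0))
    print(classify_arrival(65534, 3, 1, 0))   # wraparound, still contiguous
```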

The  message  header  structure  is  illustrated  by  the  following  diagram: 


      WORD 1           WORD 2           WORD 3           WORD 4
|                |                |                |                |
|P000TTTTHHIIIIII|LLLLLLLLZZZZZZZZ|TTTTTTTTTTTTTTTT|WCCCCCCCSSSSSSSS|DDD...
|                |                |                |^               |
|<-- HOST/IMP OR IMP/HOST LEADER->|<--TIME STAMP-->|^<COUNT><-SAVE->|<-DATA
                                                    ^
                                                    WE-SKIPPED-PARCELS

P = PRIORITY (one bit = 1)

T = MESSAGE TYPE (4 bits = 0011)

L = link ("L" OR "K", 8 bits, greater than 337 octal)

D = data bits (from here to the end of the message)

ZZZZZZZZ = 8 ZERO bits

HHIIIIII = HOST (8 bits, destination or source)

CCCCCCC = parcel COUNT (7 bits)

SSSSSSSS = 8 bits saved for future applications

TTTTTTTTTTTTTTTT = TIME STAMP (16 bits)
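The 32-bit MESSAGE-HEADER (the time stamp word followed by the word holding the WE-SKIPPED-PARCELS bit, the 7-bit COUNT and 8 saved bits) can be packed and unpacked as sketched below. The function names are hypothetical. The sketch assumes most-significant-bit-first ordering within each word, big-endian byte order on the wire, and that the 8 saved bits are sent as zero; none of these is stated explicitly here.

```python
import struct


def pack_nvp_header(time_stamp: int, skipped: bool, count: int) -> bytes:
    """Pack the two NVP header words that follow the HOST/IMP leader."""
    if not 0 <= count < 128:
        raise ValueError("parcel COUNT must fit in 7 bits")
    second = (int(bool(skipped)) << 15) | (count << 8)   # low 8 bits saved (zero)
    return struct.pack(">HH", time_stamp & 0xFFFF, second)


def unpack_nvp_header(data: bytes) -> tuple:
    """Return (time_stamp, skipped, count) from the first 4 header bytes."""
    time_stamp, second = struct.unpack(">HH", data[:4])
    return time_stamp, bool(second >> 15), (second >> 8) & 0x7F


if __name__ == "__main__":
    header = pack_nvp_header(0x1234, True, 5)
    print(header.hex(), unpack_nvp_header(header))
```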


The  first  parcel  sent  by  either  party  after  the  NEGOTIATION  or  RENEGOTIATION 
should  have  the  serial  number  set  to  zero. 


During  silence  periods,  the  Transmitter  might  send  a "6"  or  "7"  message  periodically. 
If  it  does  not  do  so,  the  Receiver  might  interrogate  the  viability  of  the  Transmitter  by 
sending  periodically  "8"  ("ARE-YOU-THERE?")  or  #10  (ECHO  REQUEST)  messages. 
3. THE LPC DATA PROTOCOL


The  DATA  sent  at  each  transmission  interval  is  called  a PARCEL. 

Network  messages  always  contain  an  integral  number  of  PARCELS. 

There  are  two  independent  issues  in  the  coding.  One  is,  obviously,  the  acoustic 
coding,  i.e.,  which  parameters  have  to  be  transmitted.  SIMPLE  acoustic  coding  is  sending 
all  the  parameters  at  every  transmission  interval.  OPTIMIZED  acoustic  coding  sends  only 
as  little  as  acoustically  needed.  DELCO  is  an  example  of  OPTIMIZED  acoustic  coding. 

In this document only the format of the SIMPLE acoustic coding is defined.

All  the  transmitted  parameters  are  sent  as  pointers  into  agreed-upon  tables.  These 
tables are defined as two lists of values. The transmitter table {X(J)} is used in the
following way: The value V is coded as the code J if X(J-1) < V ≤ X(J). The receiver
table {R(J)} is used to retrieve the value R(J) if the code J was received. X(-1) is
implicitly defined as minus-infinity, and X(Jmax) is explicitly defined as plus-infinity.

For each parameter, {X(J)} and {R(J)} may be defined independently.
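The table lookup can be sketched with a binary search. The function names are hypothetical, and the five-level tables below are only a truncated example built from the first entries of the INDEX7 table in Appendix 1; the final bound, plus-infinity, is implied by the end of the list.

```python
from bisect import bisect_left

# Finite interval bounds and receiver values for a truncated 5-level example.
X_BOUNDS = [402, 1206, 2009, 2811]
R_VALUES = [0, 804, 1608, 2411, 3212]


def encode(value: int, x_bounds: list = X_BOUNDS) -> int:
    """Return the code J such that X(J-1) < value <= X(J)."""
    # bisect_left returns the first index whose bound is >= value, which is
    # exactly the open-close interval rule; values above every finite bound
    # map to the last code.
    return bisect_left(x_bounds, value)


def decode(code: int, r_values: list = R_VALUES) -> int:
    """Return the receiver value R(J) for a received code J."""
    return r_values[code]


if __name__ == "__main__":
    code = encode(2400)
    print(code, decode(code))
```

The value 2400 falls in the interval (2009,2811], so it is coded as 3 and retrieved as 2411.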

The  second  coding  issue  is  the  information  coding  technique.  The  SIMPLE 
(information-wise)  way  of  sending  the  information  is  to  use  binary  coding  for  the  codes 
representing  the  parameters.  The  OPTIMIZED  way  is  to  compute  distributions  for  each 
parameter  and  to  define  the  appropriate  coding.  It  is  very  probable  that  the  PITCH  and 
GAIN  will  be  decoded  absolutely  in  the  first  PARCEL  of  each  message,  and  incrementally 
thereafter. 

At present, only the SIMPLE (information-wise) coding is used.

The  details  of  the  LPC  data  protocol  and  its  Tables  Set-#1  can  be  found  in 
Appendix  1. 


The  following  is  the  definition  for  the  format  of  the  SIMPLE-SIMPLE  coding,  according  to 
Tables-Set-#1: 

For each parcel:

PITCH    6 bits (PITCH=0 for UNVOICED)
GAIN     5 bits
I(1)     7 bits
I(2)     7 bits
I(3)     6 bits
I(4)     6 bits
I(5)     5 bits
I(6)     5 bits
I(7)     5 bits
I(8)     5 bits
I(9)     5 bits
I(10)    5 bits

where each of the I(j) is an index for inverse sine coding. If θj = arcsin(Kj) and N bits
are assigned for its transmission, then I(j) = 2^N θj / π.
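A small numerical sketch of inverse sine coding follows. It assumes the index is computed as I = 2^N * theta / pi with theta = arcsin(K), rounded to the nearest integer; the rounding rule is an assumption, since the exact intervals are defined by the tables in Appendix 1. The receiver-side scale factor of 32768 is inferred from the INDEX7 table values rather than stated in the text, and both function names are hypothetical.

```python
import math


def inverse_sine_index(k: float, n_bits: int) -> int:
    """Index for a coefficient k in (-1, 1), using n_bits bits."""
    if not -1.0 < k < 1.0:
        raise ValueError("coefficient must lie strictly between -1 and 1")
    theta = math.asin(k)
    return round((2 ** n_bits) * theta / math.pi)


def receiver_value(index: int, n_bits: int, scale: int = 32768) -> int:
    """Scaled sine of the angle that the index stands for (assumed scale)."""
    return round(scale * math.sin(index * math.pi / (2 ** n_bits)))


if __name__ == "__main__":
    j = inverse_sine_index(math.sqrt(0.5), 7)
    print(j, receiver_value(j, 7))
```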

Hence  at  each  transmission  interval  (128  samples  times  150  microseconds)  67  bits 
are  sent,  which  results  in  a data  rate  of  3490  bps.  Since  this  bandwidth  is  well  within  the 
capabilities  of  the  network,  SIMPLE-SIMPLE  coding  is  used,  which  requires  the  least 
computation  by  the  HOSTS.  Note  that  this  data  rate  is  a peak  rate,  without  the  use  of 
silence. 
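The arithmetic behind the 67 bits per parcel and the peak rate of about 3490 bps can be checked with a few lines of Python. The dictionary and variable names are illustrative; the numbers are those of the SIMPLE-SIMPLE format with 150-microsecond sampling and 128 samples per parcel.

```python
# Bits assigned to each parameter of one parcel in the SIMPLE-SIMPLE format.
PARCEL_BITS = {
    "PITCH": 6, "GAIN": 5,
    "I(1)": 7, "I(2)": 7, "I(3)": 6, "I(4)": 6, "I(5)": 5,
    "I(6)": 5, "I(7)": 5, "I(8)": 5, "I(9)": 5, "I(10)": 5,
}
SAMPLE_PERIOD_US = 150      # microseconds per sample
SAMPLES_PER_PARCEL = 128    # samples per transmission interval

bits_per_parcel = sum(PARCEL_BITS.values())
frame_us = SAMPLE_PERIOD_US * SAMPLES_PER_PARCEL          # 19200 us = 19.2 ms
peak_rate_bps = bits_per_parcel * 1_000_000 / frame_us

if __name__ == "__main__":
    print(bits_per_parcel, frame_us / 1000.0, round(peak_rate_bps))
```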


4. EXAMPLES FOR THE CONTROL PROTOCOL


Here  is  an  example  for  a connection: 

(377) C: 1,<WHO>,<WHOM>,340    Please talk to me on 340/341.
(340) A: 2,1                   I refuse, since I'm busy.

Another example:

(377) C: 1,<WHO>,<WHOM>,360    Please talk to me on 360/361.
(360) A: 6,350                 OK. You talk to me on 350/351.
(350) C: 1,<WHO>,<WHOM>        I want to talk to you.
(360) A: 3,1,1,2               Can you do CVSD? (ANSWERER tries
                               to be the NEGOTIATION MASTER)
(350) C: 12,1                  I want to be it.
(360) A: 13,1                  That's OK with me.
(350) C: 3,1,1,2               Can you do CVSD?
(360) A: 5,1,1                 No, but I can do LPC.
(350) C: 3,1,1,3               Can you do RELP?
(360) A: 5,1,1                 No, but I can do LPC.
(350) C: 3,1,1,1               How about LPC?
(360) A: 4,1,1                 LPC is fine with me.
(350) C: 3,2,1,150             Can you use 150 microseconds sampling?
(360) A: 4,2,150               I can use 150 microseconds.
(350) C: 3,4,3,976,1040,2016   Can you use 976, 1040, or 2016 bits/msg?
(360) A: 4,4,976               I can use 976.
(350) C: 3,5,1,10              Can you send 10 coefficients?
(360) A: 4,5,10                I can send 10.
(350) C: 3,6,1,64              Can you use a 64 sample transmission?
(360) A: 4,6,64                I can use 64.
(350) C: 3,7,2,1,2             SIMPLE or OPTIMIZED acoustic coding?
(360) A: 4,7,2                 OPTIMIZED!
(350) C: 3,8,1,1               Can you do SIMPLE info coding?
(360) A: 4,8,1                 I can do SIMPLE.
(350) C: 3,9,1,58              μ = 0.90625?
(360) A: 4,9,58                Fine with me.
(350) C: 3,10,1                Tables-Set #1?
(360) A: 4,10,1                Of course!
(350) C: 6                     I am ready. (Note: No "RINGING" sent)
(350) C: 8                     And you?
(360) A: 6                     I am ready, too.

          Data is exchanged now, on 351 and 361.

(350) C: 10,1234               Echo it, please.
(360) A: 11,1234               Here it comes!
(360) A: 10,3333               Now ANSWERER wants to measure
(350) C: 11,3333               ...the delays, too.
(???) X: 2,3                   Termination by either user.

Another  example: 


(377) C: 1,<WHO>,<WHOM>,360    Please talk to me on 360/361.
(360) A: 6,340                 Fine. You send on 340/341.
(340) C: 1,<WHO>,<WHOM>        I want to talk to you.
(360) A: 3,3,1,1               Can you use V1?
(340) C: 4,3,1                 Yes, V1 is OK.
(360) A: 3,4,1,1984            Can you use up to 1984 bits/msg?
(340) C: 5,4,976               No, but I can use 976.
(360) A: 3,4,1,976             Can you use up to 976 bits/msg?
(340) C: 4,4,976               I can use 976.
(360) A: 9                     Ringing (note how short this negotiation is!!).
(340) C: 8                     Still there?
(360) A: 9                     Still ringing.
(340) C: 8                     Still there?
(360) A: 9                     Still ringing.
(340) C: 8                     How about it?
(360) A: 9                     Still ringing.
(340) C: 2                     Forget it! (No reason given.)
Appendix 1


THE DEFINITION OF TABLES-SET-#1


by 

John  D.  Markel 

Speech  Communications  Research  Laboratory 
Santa  Barbara,  California 
TABLES-SET #1


This set includes tables for

PITCH  -  64 values, PITCH table
GAIN   -  32 values, GAIN table
I(1)   - 128 values, INDEX7 table
I(2)   - 128 values, INDEX7 table
I(3)   -  64 values, INDEX6 table
I(4)   -  64 values, INDEX6 table
I(5)   -  32 values, INDEX5 table
I(6)   -  32 values, INDEX5 table
I(7)   -  32 values, INDEX5 table
I(8)   -  32 values, INDEX5 table
I(9)   -  32 values, INDEX5 table
I(10)  -  32 values, INDEX5 table

These  tables  are  defined  specifically  for  a sampling  period  of  150  microseconds. 
GENERAL COMMENTS

The following tables are arranged in three columns, {X(j)}, {j}, and {R(j)}. Note that
the entries in the {X(j)} column are half a step off the other columns. This is to indicate
that INTERVALS from X-domain (pitch, gain, and the Ks) are mapped into CODES {j}, which
are transmitted over the network, to be translated by the receiver into the {R(j)}. These
intervals are defined as OPEN-CLOSE intervals. For example, the PITCH value (at the
transmitter) of 4131 belongs to the interval "(4024,4131]", hence it is coded as j=6, which
is mapped by the receiver to the value 21. Similarly, the value of 2400 for INDEX7 is
found to belong to the interval "(2009,2811]", coded into the CODE 3 and mapped back
into 2411.

Note that if N bits are used by a certain CODE, then there are 2^N + 1 entries in the
X-table, but only 2^N entries in the R-table.

The transformation values used for PITCH, GAIN, and the K-parameters (in the
X- and R-tables) are as defined in NSC Note 42.

Values above and below the range of the X-table are mapped into the maximum and
minimum table indices, respectively.

Note that R(J) of INDEX5 is identical to R(2J) of INDEX6, and that R(J) of INDEX6 is
identical  to  R(2J)  of  INDEX7.  Therefore,  it  is  possible  to  store  only  the  R-table  of  INDEX7, 
without  the  R-tables  of  INDEX5  and  INDEX6. 
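The sharing of R-tables can be sketched as a simple stride over the INDEX7 R-table. The function name is hypothetical, only the non-negative codes are considered, and the list below holds just the first nine INDEX7 receiver values for illustration.

```python
# First nine receiver values R(0)..R(8) of the INDEX7 table.
R_INDEX7_HEAD = [0, 804, 1608, 2411, 3212, 4011, 4808, 5602, 6393]


def derive_r_table(r_index7: list, bits: int) -> list:
    """Derive the INDEX6 (bits=6) or INDEX5 (bits=5) R-table from INDEX7.

    R(J) of INDEX6 is R(2J) of INDEX7, and R(J) of INDEX5 is R(2J) of
    INDEX6, i.e. R(4J) of INDEX7.
    """
    if bits not in (5, 6, 7):
        raise ValueError("bits must be 5, 6 or 7")
    step = 1 << (7 - bits)
    return r_index7[::step]


if __name__ == "__main__":
    print(derive_r_table(R_INDEX7_HEAD, 6))
    print(derive_r_table(R_INDEX7_HEAD, 5))
```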

In  the  SPS-41  implementation  there  is  no  need  to  store  any  R-table  for  the 
K-parameters.  The  transmitted  index  can  be  used  directly  (with  the  appropriate  scaling) 
as an index into the SPS built-in TRIG tables.

COMMENTS  ON  THE  PITCH  TABLE 

The level J=0 defines the UNVOICED condition. The receiver maps it into the
number  of  samples  per  frame  (here  128). 

This  PITCH  table  differs  significantly  from  previous  tables  and  supersedes  the  table 
published  in  NSC  Note  36.  Details  of  the  calculation  of  the  table  can  be  found  in  NSC  Note 
42.  Immediate  questions  should  be  referred  to  John  Markel. 


COMMENTS ON THE GAIN TABLE

The level J=0 defines absolute silence.

This  table  is  designed  for  a maximum  of  12-bit  A/D  input,  and  allows  for  a dynamic 
range  of  43.5  dB. 

NSC  Notes  36,  45,  56  and  58  supply  background  for  the  GAIN  table.  GAIN  is  the 
energy  of  the  pre-emphasized,  windowed  signal. 

This  table  is  the  NEW  GAIN  table.  NSC  Notes  56  and  58  explain  the  reasoning 
behind  the  NEW  GAIN. 

COMMENTS  ON  THE  INDEX7  TABLE 

Positive  values  are  coded  into  the  range  [0-63,  decimal].  Negative  values  are  coded 
into  the  7-bits  two’s  complement  of  the  codes  of  their  absolute  value  [65-127,  decimal]. 

Note  that  all  values  -403  < V < 403  are  coded  as  (and  mapped  into)  0.  Note  also 
that  the  code  -64  (100  octal)  is  never  used. 
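The sign convention can be illustrated as follows: a signed level is sent as its N-bit two's complement code, and the single code with only the top bit set is never used. The function names are hypothetical; the same sketch applies to 7-, 6- and 5-bit indices.

```python
def to_code(level: int, bits: int) -> int:
    """Map a signed level to its N-bit two's complement code."""
    half = 1 << (bits - 1)
    if not -half < level < half:
        raise ValueError("level out of range for this index width")
    return level & ((1 << bits) - 1)


def from_code(code: int, bits: int) -> int:
    """Map a received N-bit code back to its signed level."""
    half = 1 << (bits - 1)
    if not 0 <= code < (1 << bits):
        raise ValueError("code does not fit in the index width")
    if code == half:
        raise ValueError("this code is never used")
    return code - (1 << bits) if code > half else code


if __name__ == "__main__":
    print(to_code(-1, 7), to_code(-63, 7), from_code(65, 7))
```

For 7 bits, the levels 0 to 63 keep their value and the levels -1 to -63 become the codes 127 down to 65.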

In  SPS-41  implementation,  the  R-table  is  not  needed,  since  TRIG(2J)  is  the  needed 
value  R(J). 

COMMENTS  ON  THE  INDEX6  TABLE 

Positive  values  are  coded  into  the  range  [0-31,  decimal].  Negative  values  are  coded 
into  the  6-bits  two’s  complement  of  the  codes  of  their  absolute  values  [33-63,  decimal]. 

Note that all values -805 < V < 805 are coded as (and mapped into) 0. Note also
that  the  code  -32  (40  octal)  is  never  used. 

In  SPS-41  implementation,  the  R-table  is  not  needed,  since  TRIG(4J)  is  the  needed 
value  R(J). 


COMMENTS ON THE INDEX5 TABLE


Positive  numbers  are  coded  into  the  range  [0-15,  decimal].  Negative  numbers  are 
coded into the 5-bits two's complement of their absolute values, i.e., [17-31, decimal].


Note  that  all  values  -1609  < V < 1609  are  coded  as  (and  mapped  into)  0.  Note 
also  that  the  code  -16  (20  octal)  is  never  used. 


In SPS-41 implementation, the R-table is not needed, since TRIG(8J) is the needed
value  R(J). 


THE PITCH TABLE (as of 10-29-74)


 X(J)    J  R(J)     X(J)    J  R(J)     X(J)    J  R(J)

    0    0   128*    6002   21    33    10770   42    61
    0    1    18     6168   22    34    11080   43    63
 3630    2    19     6338   23    35    11399   44    65
 3724    3    19     6515   24    36    11728   45    67
 3821    4    20     6696   25    37    12067   46    69
 3921    5    20     6883   26    38    12417   47    71
 4024    6    21     7075   27    39    12776   48    73
 4131    7    22     7274   28    40    13147   49    75
 4240    8    22     7478   29    41    13529   50    77
 4353    9    23     7689   30    43    13922   51    80
 4469   10    24     7905   31    44    14327   52    82
 4588   11    24     ????   32    45    14745   53    85
 4711   12    25     8359   33    47    15175   54    87
 4838   13    26     8596   34    48    15618   55    90
 4969   14    27     8840   35    50    16075   56    93
 5104   15    27     9092   36    51    16545   57    95
 5242   16    28     9351   37    53    17029   58    98
 5385   17    29     9618   38    54    17529   59   101
 5533   18    30     9894   39    56    18043   60   104
 5684   19    31    10177   40    57    18572   61   107
 5841   20    32    10469   41    59    19118   62   111
 6002               10770               19681   63   114
                                            ∞

Note: This table has only 58 different intervals defined, since 5 values are
repeated in the R(J) table.

*This value is the "Transmission Interval" (measured in samples), as defined in
item #6 of the NEGOTIATION.
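The count in the note can be checked with a few lines. The sketch below copies the R(J) column for J = 1 to 63 and assumes two things: that the special J = 0 entry is not counted, and that R(45) is 67, the value implied by its neighbours 65 and 69.

```python
# R(J) for J = 1..63, copied from the pitch table; J = 0 is left out.
PITCH_R = [
    18, 19, 19, 20, 20, 21, 22, 22, 23, 24, 24, 25, 26, 27, 27, 28,
    29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 43, 44, 45,
    47, 48, 50, 51, 53, 54, 56, 57, 59, 61, 63, 65, 67, 69, 71, 73,
    75, 77, 80, 82, 85, 87, 90, 93, 95, 98, 101, 104, 107, 111, 114,
]

# Values that repeat the entry just before them.
repeated = [r for prev, r in zip(PITCH_R, PITCH_R[1:]) if r == prev]

# Number of distinct reconstruction values.
distinct = len(set(PITCH_R))
```

With these inputs `repeated` holds the five values 19, 20, 22, 24 and 27, and `distinct` is 58, which agrees with the note.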



THE GAIN TABLE (as of 9-17-75)


X(J)    J  R(J)    X(J)    J  R(J)
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THE INDEX7 TABLE (as of 9-23-74)


 X(J)    J   R(J)     X(J)    J   R(J)     X(J)    J   R(J)

    0    0      0    15800   21  16151    27897   42  28106
  402    1    804    16500   22  16846    28311   43  28511
 1206    2   1608    17190   23  17531    28707   44  28899
 2009    3   2411    17869   24  18205    29086   45  29269
 2811    4   3212    18538   25  18868    29448   46  29622
 3612    5   4011    19195   26  19520    29792   47  29957
 4410    6   4808    19841   27  20160    30118   48  30274
 5205    7   5602    20475   28  20788    30425   49  30572
 5998    8   6393    21097   29  21403    30715   50  30853
 6787    9   7180    21706   30  22006    30986   51  31114
 7571   10   7962    22302   31  22595    31238   52  31357
 8351   11   8740    22884   32  23170    31471   53  31581
 9127   12   9512    23453   33  23732    31686   54  31786
 9896   13  10279    24008   34  24279    31881   55  31972
10660   14  11039    24548   35  24812    32058   56  32138
11417   15  11793    25073   36  25330    32214   57  32286
12167   16  12540    25583   37  25833    32352   58  32413
12910                26078                32470
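A table lookup of this kind can be sketched with a binary search. The example assumes that a magnitude |V| is coded as the largest J with X(J) <= |V| and decoded as R(J). That rule is inferred from the comments on the INDEX5 and INDEX6 tables and is not stated in the listing itself. Only the first five rows are copied here, so the hypothetical helper rejects values at or beyond X(5) = 3612.

```python
from bisect import bisect_right

# First rows of the INDEX7 table (J = 0..4), copied from the listing.
X = [0, 402, 1206, 2009, 2811]
R = [0, 804, 1608, 2411, 3212]
X_LIMIT = 3612  # X(5): first threshold outside this excerpt


def encode_magnitude(v: int) -> int:
    """Return the largest J with X(J) <= |v| (assumed coding rule)."""
    m = abs(v)
    if m >= X_LIMIT:
        raise ValueError("value lies beyond the excerpted rows")
    return bisect_right(X, m) - 1


def decode_magnitude(j: int) -> int:
    """Return the reconstruction value R(J)."""
    return R[j]
```

For example, `encode_magnitude(401)` gives 0 and `encode_magnitude(402)` gives 1, so every value with |V| < 402 is mapped to 0.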
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APPENDIX  2 

IMPLEMENTATION  RECOMMENDATIONS 


1.  It  is  recommended  that  the  priority-bit  be  turned  ON  in  the  HOST/IMP  header. 

2.  It  is  recommended  that  in  all  abbreviations,  "R"  be  used  for  Receiver  and  "X"  for 
Transmitter. 


3.  The  following  identifiers  and  values  are  recommended  for  implementations: 


SLNCTH  30  SILENCE-THRESHOLD.

Used for LONG-SILENCE definition. See below. Measured in the
same units as GAIN, in its X-table.

TBS  1.000 sec  TIME-BEGIN-SILENCE.

LONG-SILENCE is declared if GAIN<SLNCTH for more than TBS.


TAS  0.500  sec  TIME-AFTER-SILENCE. 

A delay  introduced  by  the  receiver  after  the  end  of 
LONG-SILENCE,  before  restarting  the  playback. 

TES  0.150  sec  TIME-END-SILENCE. 

The  amount  of  time  the  transmitter  backs  up  at  the  end  of  a 
LONG-SILENCE  in  order  to  ensure  a smooth  transition  back  to 
speech. 

TRI  2.000  sec  TIME-RESPONSE-INITIAL. 

Time  for  waiting  for  response  for  an  initial  call  (#1  and  #3).  The 
initial  call  is  repeated  every  TRI  until  an  answer  arrives,  or  until 
TRIGU  expires. 


TRIGU  20.000  sec  TIME-RESPONSE-INITIAL-GIVEUP. 

If  no  response  to  an  initial  call  is  received  within  TRIGU  after 
the  FIRST  initial  call,  the  system  gives  up,  assuming  the  other 
system  is  down. 


TRQ  1.000  sec  TIME-RESPONSE-INQUIRY. 

If  no  response  to  an  inquiry  (#8)  is  received  within  TRQ,  the 
inquiry  is  repeated. 
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TRQGU  10.000  sec  TIME-RESPONSE-INQUIRY-GIVEUP. 

If  no  response  to  an  inquiry  is  received  within  TRQGU  from  the 
FIRST  inquiry,  the  system  gives  up,  assuming  the  other  system  is 
down. 


TBDA  3.000  sec  TIME-BETWEEN-DATA-ARRIVAL. 

If  no  data  arrives  within  TBDA,  an  INQUIRY  (#8)  is  sent.  This 
repeats  every  TBDA. 

TNR  2.000  sec  TIME-NOT-READY. 

If  the  other  system  is  in  the  NOT-READY  (#7)  state  for  more  than 
TNR,  an  INQUIRY  (#8)  is  sent.  This  repeats  every  TNR. 


TNRGU  10.000  sec  TIME-NOT-READY-GIVEUP. 

If  the  other  system  is  in  the  NOT-READY  (#7)  state  for  more  than 
TNRGU,  then  the  system  gives  up,  assuming  the  other  system  is 
down. 


TBIN  3.000  sec  TIME-BUFFER-IN. 

The  input  buffer  size  is  equivalent  to  the  time  period  TBIN  (and 
its  size  is  the  DATA-RATE  multiplied  by  the  period  TBIN).  If  the 
INPUT  QUEUE  ever  gets  to  be  longer  than  TBIN,  data  is  discarded. 

TBOUT  3.000  sec  TIME-BUFFER-OUT. 

The  output  buffer  size  is  equivalent  to  the  time  period  TBOUT 
(and  its  size  is  the  DATA-RATE  multiplied  by  the  period  TBOUT). 
If  the  OUTPUT  QUEUE  ever  gets  to  be  longer  than  TBOUT,  data  is 
discarded. 
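Three of the recommendations above can be sketched in code: the LONG-SILENCE test, the repetition of the initial call, and the buffer sizing. This is a minimal sketch under stated assumptions, not a description of any actual implementation. The callbacks `send`, `answered`, `now` and `wait` are hypothetical hooks supplied by the caller, times are in seconds, and the data rate is assumed to be given in bits per second.

```python
from typing import Callable, Optional

# Recommended values from the list above.
SLNCTH = 30      # silence threshold, in GAIN X-table units
TBS = 1.0        # TIME-BEGIN-SILENCE
TRI = 2.0        # TIME-RESPONSE-INITIAL
TRIGU = 20.0     # TIME-RESPONSE-INITIAL-GIVEUP
TBIN = 3.0       # TIME-BUFFER-IN


class LongSilenceDetector:
    """Declares LONG-SILENCE once GAIN < SLNCTH has held for more than TBS."""

    def __init__(self, threshold: int = SLNCTH, tbs: float = TBS) -> None:
        self.threshold = threshold
        self.tbs = tbs
        self._quiet_since: Optional[float] = None

    def update(self, gain: int, now: float) -> bool:
        if gain >= self.threshold:
            self._quiet_since = None          # speech resets the timer
            return False
        if self._quiet_since is None:
            self._quiet_since = now           # first quiet frame
        return now - self._quiet_since > self.tbs


def initial_call(send: Callable[[], None], answered: Callable[[], bool],
                 now: Callable[[], float], wait: Callable[[float], None],
                 tri: float = TRI, trigu: float = TRIGU) -> bool:
    """Repeat the initial call every TRI until answered or TRIGU has elapsed."""
    start = now()
    while now() - start < trigu:
        send()
        wait(tri)                             # allow up to TRI for an answer
        if answered():
            return True
    return False                              # give up: peer assumed down


def buffer_bits(data_rate_bps: float, period: float = TBIN) -> int:
    """Buffer size: the DATA-RATE multiplied by the buffering period."""
    return int(data_rate_bps * period)
```

Passing the clock and the wait function as parameters keeps the retry loop testable without real delays. The same loop shape would fit the inquiry timers (TRQ, TRQGU) and the not-ready timers (TNR, TNRGU).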
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